The analysis below is for Baltimore homeownership incentive programs including:

Weatherization and repairs will be added once geocoding is completed.

The primary research questions are:

Loading required package: sp

Homeownership

All Homeownership Incentive Distributions, 2016-2018

Each incentive distribution was geocoded to repair/replace neighborhood assignment and to provide point locations for any later clustering or hot spot analysis.

Funding distribution by HMT group

The summary table below gives the percent of program funding that was distributed to each housing market type. Live Near Your Work is utilized primarily in strong/healhty markets, while 85% of CDBG funding goes to middle neighborhoods.

incen %>% 
  group_by(prog.type, hmt.group) %>%
  summarise(total = sum(as.numeric(amount))) %>%
  mutate(total = ifelse(is.na(total), 0, total)) %>%
  filter(!is.na(prog.type)) %>%
  ungroup() %>%
  group_by(prog.type) %>%
  mutate(pct = percent(total / sum(total))) %>%
  select(-total) %>%
  spread(key = prog.type, value = pct)

The map and table below present a program “utilization” metric - the number of incentives used per 100 home sales.

The summary table below gives program “utilization” by housing market typology tier.

Program demographics

Effect of Programs

Error: `by` required, because the data sources have no common variables
Call `rlang::last_error()` to see a backtrace

Weatherization + Repair

[needs geocoded]

---
title: "Homebuying Incentive Programs"
subtitle: "Department of Housing and Community Development, City of Baltimore"
author: "Justin Elszasz, Mayor's Office of Innovation"
email: "justin.elszasz@baltimorecity.gov"
date: "Monday, March 25, 2019"
output:
  html_notebook:
    code_folding: hide
    fig_height: 5
    fig_width: 10
    toc: yes
    toc_depth: 2
editor_options: 
  chunk_output_type: inline
---

The analysis below is for Baltimore homeownership incentive programs including:
 
- Community Development Block Grants
- Live Near Your Work
- Vacants to Value

Weatherization and repairs will be added once geocoding is completed.

The primary research questions are:

- Where are each of the programs utilized and are there concentrations?
- What programs are serving strong, middle, and distressed neighborhoods, respectively?

```{r setup, include = FALSE, echo = FALSE, message = FALSE, cache = TRUE}
knitr::opts_chunk$set(echo = FALSE, warning = F, message = F, include = T,
                                 fig.width = 10, fig.height = 5)

options(scipen = 999)
```


```{r}
library(readxl)
library(ggmap)
library(leaflet)
library(htmltools)
library(RSocrata)
library(shiny)
source("../src/00_initialize.R")
```

```{r load.data, cache = T, warning = F, echo = F, message = F}
incen <- read_csv("../data/processed/homebuying_incentives/homebuying_incentive_programs.csv", na = c("NA"))

#hoods <- get_neighborhood_boundaries()
hoods <- readRDS("../data/processed/hoods.Rds")
hmt.by.hood <- read_excel("../data/raw/hmt/HMT by Neighborhood 2017.xlsx") %>%
  transmute(neighborhood = Neighborhood,
            predominant.code = `Predominant Code Ignoring Non-Residential`)
sales <- load_sales_data(load.cache = T)

#real.prop.url <- "https://data.baltimorecity.gov/resource/6act-qzuy.json"
#real.prop <- read.socrata(real.prop.url, app_token = VARS$SOCRATA_TOKEN)
real.prop <- read_rds("../data/processed/real_prop.Rds")

```

```{r}
hoods@data <- hoods@data %>% left_join(hmt.by.hood, by = c("label" = "neighborhood"))
```

```{r}
hmt.levels <- c("healthy", "upper middle", "lower middle", "distressed", "other")

incen <- incen %>% 
  mutate_at(
    vars(prog.type, hmt.group, predominant.code, label, gender, race, ethnicity),
    funs(as.factor)
  ) %>%
  mutate(hmt.group = fct_relevel(hmt.group, hmt.levels))

```

```{r}
sales <- sales %>% rename(sales.block = Block, sales.lot = Lot)
real.prop <- real.prop %>% rename(real.block = block, real.lot = lot)
```

```{r}
real.prop <- real.prop %>%
  mutate(real.block.clean = gsub("^0+", "", real.block),
         real.lot.clean = gsub("^0+", "", real.lot))

sales <- sales %>%
  mutate(sales.block.clean = gsub("^0+", "", sales.block),
         sales.lot.clean = gsub("^0+", "", sales.lot)) 
```

```{r}
# filter for principal residence and "normal" purchases
sales <- sales %>%
  left_join(real.prop, 
            by = c("sales.block.clean" = "real.block.clean",
                   "sales.lot.clean" = "real.lot.clean")
            ) %>%
  filter(`How Conveyed` == 1,
         !grepl("NOT", rescode))
```


# Homeownership

### All Homeownership Incentive Distributions, 2016-2018

Each incentive distribution was geocoded to repair/replace neighborhood assignment and to provide point locations for any later clustering or hot spot analysis.

```{r out.width = "100%", fig.width = 8, fig.height = 6}
plot.colors <- c(iteam.colors[1], iteam.colors[4], iteam.colors[5])


pal <- colorFactor(plot.colors,
                   domain = incen$prog.type)

hoods.labels <- paste0(
  hoods$label,
  "<br>2017 Housing Market Typology: ", hoods$predominant.code
)

labels <- paste0(
  incen$prog.type,
  "<br>", incen$house.num, " ", incen$street, " ", incen$street.type,
  "<br>Amount: $", incen$amount
  
)

leaflet() %>%
  setView(lng = -76.6, lat = 39.3, zoom = 11) %>%
  addProviderTiles(providers$Stamen.TonerLite) %>% 
  addPolygons(data = hoods,
              stroke = T,
              fill = T,
              fillOpacity = 0,
              color = "black",
              weight = 3,
              label = ~lapply(hoods.labels, HTML),
              opacity = .4) %>%
  addCircleMarkers(data = incen %>% filter(prog.type == "LNYW"),
                   radius = 1,
                   color = iteam.colors[1],
                   label = ~lapply(labels, HTML),
                   group = "LNYW"
  ) %>%
  addCircleMarkers(data = incen %>% filter(prog.type == "CDBG"),
                   radius = 1,
                   color = iteam.colors[4],
                   label = ~lapply(labels, HTML),
                   group = "CDBG"
  ) %>%
  addCircleMarkers(data = incen %>% filter(prog.type == "V2V"),
                   radius = 1,
                   color = iteam.colors[3],
                   label = ~lapply(labels, HTML),
                   group = "V2V"
  ) %>%
  addLayersControl(overlayGroups = c("LNYW", "CDBG", "V2V"),
                   options = layersControlOptions(collapsed = FALSE))

```

### Funding distribution by HMT group

The summary table below gives the percent of program funding that was distributed to each housing market type. Live Near Your Work is utilized primarily in strong/healhty markets, while 85% of CDBG funding goes to middle neighborhoods.


```{r echo = T}
incen %>% 
  group_by(prog.type, hmt.group) %>%
  summarise(total = sum(as.numeric(amount))) %>%
  mutate(total = ifelse(is.na(total), 0, total)) %>%
  filter(!is.na(prog.type)) %>%
  ungroup() %>%
  group_by(prog.type) %>%
  mutate(pct = percent(total / sum(total))) %>%
  select(-total) %>%
  spread(key = prog.type, value = pct)
```




```{r}
# Justin to add roll up of dollar amounts

hoods@data <- hoods@data %>%
  left_join(
    incen %>% 
      count(label, prog.type) %>% 
      spread(key = prog.type, value = n),
    by = c("label" = "label")
  ) %>%
  left_join(
    sales %>% 
      filter(deed.date >= "2015-07-01",
             deed.date <= "2018-06-30") %>% 
      count(neighborhood) %>%
      rename(sales.16_18 = n),
    by = c("label" = "neighborhood")
  )
```

```{r}
hoods@data <- hoods@data %>%
  mutate_at(
    vars(CDBG, LNYW, V2V, sales.16_18),
    funs(replace(., is.na(.), 0))
  ) %>%
  mutate(
    lnyw.per.100sales = 100 * LNYW / sales.16_18,
    cdbg.per.100sales = 100 * CDBG / sales.16_18,
    v2v.per.100sales = 100 * V2V / sales.16_18
  )
```

The map and table below present a program "utilization" metric - the number of incentives used per 100 home sales.

The summary table below gives program "utilization" by housing market typology tier.

```{r}
hoods@data %>%
  mutate_hmt_group(predominant.code) %>%
  mutate(hmt.group = fct_relevel(hmt.group, hmt.levels)) %>%
  group_by(hmt.group) %>%
  summarise(
    sales.16_18 = sum(sales.16_18),
    lnyw.count = sum(LNYW),
    lnyw.per.100sales = 100 * lnyw.count / sum(sales.16_18),
    cdbg.count = sum(CDBG),
    cdbg.per.100sales = 100 * cdbg.count / sum(sales.16_18),
    v2v.count = sum(V2V),
    v2v.per.100sales = 100 * v2v.count / sum(sales.16_18)) %>%
  mutate_at(
    vars(lnyw.per.100sales, cdbg.per.100sales, v2v.per.100sales),
    funs(round(.,1)))
```

```{r out.width = "100%", fig.width = 8, fig.height = 6}
#varname <- "lnyw.per.100sales"

  bins <- c(0, 0.1, 0.5,  1, 5, 10, 50, 100)

  pal <- colorBin("Blues",
                  domain = c(0, 100),
                  bins = bins)

lnyw.labels <- paste0(
  hoods$label,
  "<br>2017 Housing Market Typology: ", hoods$predominant.code,
  "<br>LNYW per 100 sales: ", as.character(round(hoods$lnyw.per.100sales, 1))
)

cdbg.labels <- paste0(
  hoods$label,
  "<br>2017 Housing Market Typology: ", hoods$predominant.code,
  "<br>CDBG per 100 sales: ", as.character(round(hoods$cdbg.per.100sales, 1))
)

v2v.labels <- paste0(
  hoods$label,
  "<br>2017 Housing Market Typology: ", hoods$predominant.code,
  "<br>V2V per 100 sales: ", as.character(round(hoods$v2v.per.100sales, 1))
)

leaflet() %>%
  setView(lng = -76.6, lat = 39.3, zoom = 11) %>%
  addProviderTiles(providers$Stamen.TonerLite) %>%
    addPolygons(data = hoods,
              color = "black",
              weight = 2,
              fill = T,
              fillOpacity = 0) %>%
  addPolygons(data = hoods,
              fillOpacity = .5,
              fillColor = ~pal(hoods$lnyw.per.100sales),
              color = "gray10",
              weight = 2,
              label = ~lapply(lnyw.labels, HTML),
              group = "LNYW") %>%
  addPolygons(data = hoods,
              fillOpacity = .5,
              fillColor = ~pal(hoods$cdbg.per.100sales),
              color = "gray10",
              weight = 2,
              label = ~lapply(cdbg.labels, HTML),
              group = "CDBG") %>%
  addPolygons(data = hoods,
              fillOpacity = .5,
              fillColor = ~pal(hoods$v2v.per.100sales),
              color = "gray10",
              stroke = T,
              opacity = 1,
              weight = 2,
              label = ~lapply(v2v.labels, HTML),
              group = "V2V") %>%
  addLegend(pal = pal, values = bins, title = "Incentives per 100 Sales") %>%
  addLayersControl(baseGroups = c("LNYW", "CDBG", "V2V"),
                   options = layersControlOptions(collapsed = FALSE))

```



### Program demographics

```{r fig.width = 6, fig.height = 3}
incen %>%
  ggplot() +
  geom_density(aes(household.income, color = prog.type)) +
  theme_iteam_presentations() +
  xlim(c(0, 300000))
```

```{r}
incen %>%
  filter(household.income <= 200000) %>%
  ggplot() +
  geom_boxplot(aes(prog.type, household.income)) +
  theme_iteam_google_docs()
```
```{r}
incen %>%
  group_by(prog.type) %>%
  summarise(median.income = median(household.income, na.rm = T),
            mean.income = mean(household.income, na.rm = T))
```

```{r}
incen %>%
  count(prog.type, race) %>%
  group_by(prog.type) %>%
  mutate(pct = percent(n / sum(n))) %>%
  select(-n) %>%
  spread(key = prog.type, value = pct)
```

# Effect of Programs

```{r}
incen.analytics <- incen %>%
  transmute(label = label,
            hmt.group = hmt.group,
            prog.type = prog.type,
            price.income.ratio = sales.price / household.income,
            incen.income.ratio = amount / household.income,
            incen.price.ratio = amount / sales.price)
  
```


```{r}
incen.analytics %>%
  ggplot(aes(incen.price.ratio)) +
  geom_density(aes(color = prog.type)) +
  xlim(c(0, .25))
```

```{r}
incen.analytics %>% 
  group_by(prog.type) %>%
  summarise(median.incen.price.ratio = median(incen.price.ratio, na.rm = T),
            mean.incen.price.ratio = mean(incen.price.ratio, na.rm = T))
```

```{r}
incen.analytics %>%
  group_by(hmt.group, prog.type) %>%
  summarise(median.incen.price.ratio = percent(median(incen.price.ratio, na.rm = T))) %>%
  spread(key = prog.type, value = median.incen.price.ratio)
```

```{r}
# justin start here and join the incentives to the sales
sales.incen <- sales %>%
  left_join(
    incen, 
    by = c())
```


# Weatherization + Repair

[needs geocoded]

